The Tao of Inference in Privacy-Protected Databases

نویسندگان

  • Vincent Bindschaedler
  • Paul Grubbs
  • David Cash
  • Thomas Ristenpart
  • Vitaly Shmatikov
چکیده

To protect database confidentiality even in the face of full compromise while supporting standard functionality, recent academic proposals and commercial products rely on a mix of encryption schemes. The common recommendation is to apply strong, semantically secure encryption to the “sensitive” columns and protect other columns with property-revealing encryption (PRE) that supports operations such as sorting. We design, implement, and evaluate a new methodology for inferring data stored in such encrypted databases. The cornerstone is the multinomial attack, a new inference technique that is analytically optimal and empirically outperforms prior heuristic attacks against PRE-encrypted data. We also extend the multinomial attack to take advantage of correlations across multiple columns. These improvements recover PRE-encrypted data with sufficient accuracy to then apply machine learning and record linkage methods to infer the values of columns protected by semantically secure encryption or redaction. We evaluate our methodology on medical, census, and union-membership datasets, showing for the first time how to infer full database records. For PRE-encrypted attributes such as demographics and ZIP codes, our attack outperforms the best prior heuristic by a factor of 16. Unlike any prior technique, we also infer attributes, such as incomes and medical diagnoses, protected by strong encryption. For example, when we infer that a patient in a hospital-discharge dataset has a mental health or substance abuse condition, this prediction is 97% accurate.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved Univariate Microaggregation for Integer Values

Privacy issues during data publishing is an increasing concern of involved entities. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational priva...

متن کامل

رعایت «حریم بیماران» توسط تیم درمان و ارتباط آن با رضایتمندی بیماران در بخش اورژانس بیمارستان

Background & Objective: Privacy is a basic humanity principle. Protecting patients;apos privacy is a necessity in health care organizations and along with the patients;apos satisfaction, is one of the main indicators of quality of care. The objective of this study was to assess patients;apos privacy protecting by medical staff and its relation to patients;apos satisfaction.Methods & Materials: ...

متن کامل

ارایه یک روش جدید انتشار داده‌ها با حفظ محرمانگی با هدف بهبود دقّت طبقه‌‌بندی روی داده‌های گمنام

Data collection and storage has been facilitated by the growth in electronic services, and has led to recording vast amounts of personal information in public and private organizations databases. These records often include sensitive personal information (such as income and diseases) and must be covered from others access. But in some cases, mining the data and extraction of knowledge from thes...

متن کامل

An Architecture for Security and Protection of Big Data

The issue of online privacy and security is a challenging subject, as it concerns the privacy of data that are increasingly more accessible via the internet. In other words, people who intend to access the private information of other users can do so more efficiently over the internet. This study is an attempt to address the privacy issue of distributed big data in the context of cloud computin...

متن کامل

Security and Privacy for Sensor Databases

The most compelling sensor technology advances of this decade are deploying wireless networks of heterogeneous smart sensor nodes for complex information gathering tasks. Sensors in wireless sensor networks operate under a set of unique and fundamental constraints that make collaborative information gathering tasks challenging. Sensors in the network simultaneously participate in the collaborat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • IACR Cryptology ePrint Archive

دوره 2017  شماره 

صفحات  -

تاریخ انتشار 2017